Dominant and K Nearest Probabilistic Skylines
Abstract
By definition, objects that are skyline points cannot be compared with each other. Yet, thanks to the probabilistic skyline model, skyline points with repeated observations can now be compared. In this model, each object is assigned a value denoting its probability of being a skyline point. Using this model, several questions arise naturally: (1) Which objects have skyline probabilities larger than that of a given object? (2) Which objects are the K nearest neighbors of a given object according to their skyline probabilities? (3) What is the ranking of these objects based on their skyline probabilities? To the best of our knowledge, no existing work answers any of these questions. Yet answering them is not trivial: even for a medium-sized dataset, it may take more than an hour to obtain the skyline probabilities of all the objects. In this paper, we propose a tree called SPTree that answers all these queries efficiently. SPTree is based on the idea of space partitioning: we partition the dataspace into several subspaces so that we do not need to compute the skyline probabilities of all objects. Extensive experiments are conducted, and the encouraging results show that our work is highly feasible.
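As a rough illustration of the three queries listed in the abstract, the Python sketch below computes skyline probabilities by brute force and answers each query from them. It is not SPTree and does no space partitioning; it is the kind of exhaustive baseline that the abstract describes as expensive. It assumes the commonly used definition of skyline probability for objects with repeated observations: smaller values are better in every dimension, each observation of an object is equally likely, and objects are independent. All function names are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name: str, data: Dict[str, List[Point]]) -> float:
    """Average, over the observations of `name`, of the probability that
    no other object dominates that observation."""
    observations = data[name]
    total = 0.0
    for u in observations:
        p = 1.0
        for other, other_obs in data.items():
            if other == name:
                continue
            dominating = sum(1 for v in other_obs if dominates(v, u))
            p *= 1.0 - dominating / len(other_obs)
        total += p
    return total / len(observations)


def all_probabilities(data: Dict[str, List[Point]]) -> Dict[str, float]:
    """Brute force: compute the skyline probability of every object."""
    return {name: skyline_probability(name, data) for name in data}


def dominant_query(query: str, probs: Dict[str, float]) -> List[str]:
    """Query (1): objects whose skyline probability exceeds the query's."""
    return sorted(n for n, p in probs.items() if p > probs[query])


def k_nearest_query(query: str, probs: Dict[str, float], k: int) -> List[str]:
    """Query (2): the k objects closest to the query in skyline probability."""
    others = [n for n in probs if n != query]
    others.sort(key=lambda n: (abs(probs[n] - probs[query]), n))
    return others[:k]


def ranking(probs: Dict[str, float]) -> List[str]:
    """Query (3): objects ordered by descending skyline probability."""
    return sorted(probs, key=lambda n: (-probs[n], n))
```

With three made-up objects, `A = [(1, 4), (2, 2)]`, `B = [(3, 1), (4, 3)]` and `C = [(3, 4), (5, 5)]`, the second observation of `B` is dominated by one of the two observations of `A`, so `B` ends up with a lower skyline probability than `A`, while every observation of `C` is dominated by all observations of `A`. Every call to `skyline_probability` compares against all observations of all other objects, which is the cost the abstract says SPTree is designed to avoid.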
Similar Resources
Threshold Phenomena in k-Dominant Skylines of Random Samples
Skylines emerged as a useful notion in database queries for selecting representative groups in multivariate data samples for further decision making, multi-objective optimization or data processing, and the k-dominant skylines were naturally introduced to resolve the abundance of skylines when the dimensionality grows or when the coordinates are negatively correlated. We prove in this paper tha...
Continuous Probabilistic Skyline Queries over Uncertain Data Streams
Recently, some approaches of finding probabilistic skylines on uncertain data have been proposed. In these approaches, a data object is composed of instances, each associated with a probability. The probabilistic skyline is then defined as a set of non-dominated objects with probabilities exceeding or equaling a given threshold. In many applications, data are generated as a form of continuous d...
Probabilistic Skylines on Uncertain Data
Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain data remains an open problem at large. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic...
An Indoor Positioning System Based on Wi-Fi for Energy Management in Smart Buildings
To offer indoor services to occupants in the context of smart buildings, it is necessary to consider information concerning the identity and location of the occupants. This paper proposes an indoor positioning system (IPS) based on Wi-Fi fingerprinting and the K-nearest neighbors (KNN) method. The positioning of a mobile device (MD) using Wi-Fi technology involves online and offline phases. In this...
Fuzzy K-Nearest Neighbor Method to Classify Data in a Closed Area
Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and its application to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.